A Progressive Batching L-BFGS Method for Machine Learning

Authors

  • Raghu Bollapragada
  • Dheevatsa Mudigere
  • Jorge Nocedal
  • Hao-Jun Michael Shi
  • Ping Tak Peter Tang
Abstract

The standard L-BFGS method relies on gradient approximations that are not dominated by noise, so that search directions are descent directions, the line search is reliable, and quasi-Newton updating yields useful quadratic models of the objective function. All of this appears to call for a full batch approach, but since small batch sizes give rise to faster algorithms with better generalization properties, L-BFGS is currently not considered an algorithm of choice for large-scale machine learning applications. One need not, however, choose between the two extremes represented by the full batch or highly stochastic regimes, and may instead follow a progressive batching approach in which the sample size increases during the course of the optimization. In this paper, we present a new version of the L-BFGS algorithm that combines three basic components — progressive batching, a stochastic line search, and stable quasi-Newton updating — and that performs well on training logistic regression and deep neural networks. We provide supporting convergence theory for the method.
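The three components named above can be made concrete in a short sketch. The code below is a minimal illustration, not the authors' implementation: a fixed geometric batch-growth schedule stands in for the paper's statistical batch-size tests, a plain Armijo backtracking search on the sampled objective stands in for the stochastic line search, and all names (`loss_grad`, `batch0`, `growth`) are assumptions.

```python
import numpy as np

def two_loop_direction(grad, s_list, y_list):
    """Standard L-BFGS two-loop recursion: returns -H_k @ grad."""
    q = grad.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):
        a = s.dot(q) / y.dot(s)
        alphas.append(a)
        q -= a * y
    if s_list:
        # Scale by gamma_k = s'y / y'y, the usual initial-Hessian choice.
        q *= s_list[-1].dot(y_list[-1]) / y_list[-1].dot(y_list[-1])
    for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):
        b = y.dot(q) / y.dot(s)
        q += (a - b) * s
    return -q

def progressive_batching_lbfgs(loss_grad, w, n, batch0=64, growth=2.0,
                               memory=10, max_iters=100, c1=1e-4):
    """loss_grad(w, idx) -> (loss, gradient) over the sample subset idx.

    The geometric batch growth below is a stand-in for the paper's
    statistical batch-size tests; it is NOT the authors' rule.
    """
    rng = np.random.default_rng(0)
    batch = float(batch0)
    s_list, y_list = [], []
    for _ in range(max_iters):
        idx = rng.choice(n, size=min(int(batch), n), replace=False)
        f, g = loss_grad(w, idx)
        d = two_loop_direction(g, s_list, y_list)
        # Backtracking (Armijo) line search on the *same* sample, a
        # simple stand-in for the paper's stochastic line search.
        t = 1.0
        while True:
            f_new, g_new = loss_grad(w + t * d, idx)
            if f_new <= f + c1 * t * g.dot(d) or t < 1e-8:
                break
            t *= 0.5
        w_new = w + t * d
        s, y = w_new - w, g_new - g
        # "Stable quasi-Newton updating": store the pair only if the
        # curvature condition y's > 0 holds safely.
        if y.dot(s) > 1e-10 * np.linalg.norm(s) * np.linalg.norm(y):
            s_list.append(s)
            y_list.append(y)
            if len(s_list) > memory:
                s_list.pop(0)
                y_list.pop(0)
        w = w_new
        batch = min(batch * growth, float(n))  # progressively larger sample
    return w
```

The guard on y's is what "stable quasi-Newton updating" amounts to here: a curvature pair is stored only when it keeps the L-BFGS matrix positive definite.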


Similar Articles

Hessian Free Optimization Methods for Machine Learning Problems

In this article, we describe the algorithm and study the performance of a Hessian-free optimization technique applied to machine learning problems. We implement the commonly used black-box model for optimization and solve a particularly challenging recursive neural network learning problem, which exhibits a non-convex and non-differentiable objective. In order to adapt the method to machine ...
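Though the summary is truncated, the core mechanism of Hessian-free (truncated Newton) optimization is compact enough to sketch: the Newton system is solved by conjugate gradient, which needs only Hessian-vector products, never the Hessian itself. The sketch below is illustrative only; the finite-difference product, the `grad_fn` interface, and the tolerances are assumptions, not the article's code.

```python
import numpy as np

def hessian_vector_product(grad_fn, w, v, eps=1e-6):
    """Approximate H @ v by central-differencing the gradient,
    so the Hessian is never formed explicitly."""
    return (grad_fn(w + eps * v) - grad_fn(w - eps * v)) / (2 * eps)

def cg_solve(hvp, b, iters=50, tol=1e-10):
    """Conjugate gradient for H x = b using only H @ v products.
    Assumes H is (locally) positive definite, e.g. a Gauss-Newton
    approximation."""
    x = np.zeros_like(b)
    r = b.copy()      # residual b - H x, with x = 0
    p = r.copy()
    rs = r.dot(r)
    for _ in range(iters):
        Hp = hvp(p)
        alpha = rs / p.dot(Hp)
        x += alpha * p
        r -= alpha * Hp
        rs_new = r.dot(r)
        if rs_new < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# One (undamped) Newton step: solve H d = -g, then w <- w + d.
# d = cg_solve(lambda v: hessian_vector_product(grad_fn, w, v), -grad_fn(w))
```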


Large-scale L-BFGS using MapReduce

L-BFGS has been applied as an effective parameter estimation method for various machine learning algorithms since the 1980s. With an increasing demand to deal with massive numbers of instances and variables, it is important to scale up and parallelize L-BFGS effectively in a distributed system. In this paper, we study the problem of parallelizing the L-BFGS algorithm in large clusters of tens of thousands of ...
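The parallelization in question rests on a simple fact: the loss and gradient are sums over data points, so each worker can emit a partial result over its shard (the map phase) and the partials are summed (the reduce phase), after which the L-BFGS update itself runs serially. A minimal single-process sketch of that pattern follows, using a logistic-regression loss chosen here for concreteness and illustrative names throughout:

```python
from functools import reduce
import numpy as np

def map_shard(w, X, y):
    """One mapper: partial logistic-regression loss and gradient on a
    shard. (Minimal numerical safeguards; a sketch only.)"""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    loss = -(y * np.log(p) + (1 - y) * np.log(1 - p)).sum()
    grad = X.T @ (p - y)
    return loss, grad

def reduce_pair(a, b):
    """One reduce step: sum two (loss, gradient) partials."""
    return a[0] + b[0], a[1] + b[1]

def full_loss_grad(w, shards):
    """Map over shards, reduce the partials; the L-BFGS direction is
    then computed on the driver from the aggregated gradient."""
    partials = (map_shard(w, X, y) for X, y in shards)  # map phase
    return reduce(reduce_pair, partials)                # reduce phase
```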


A Robust Multi-Batch L-BFGS Method for Machine Learning

This paper describes an implementation of the L-BFGS method designed to deal with two adversarial situations. The first occurs in distributed computing environments where some of the computational nodes devoted to the evaluation of the function and gradient are unable to return results on time. A similar challenge occurs in a multi-batch approach in which the data points used to compute functio...
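A key device in this multi-batch line of work is to form curvature pairs on the overlap between consecutive batches, so that the gradient difference y reflects a change in the iterate rather than a change in the sample. A minimal sketch, with `grad_fn` and all names assumed for illustration:

```python
import numpy as np

def overlap_curvature_pair(grad_fn, w_prev, w_new, batch_prev, batch_new):
    """Form an L-BFGS curvature pair on the samples shared by two
    consecutive batches; grad_fn(w, idx) is an assumed interface that
    returns the gradient over the subset idx."""
    overlap = np.intersect1d(batch_prev, batch_new)
    s = w_new - w_prev
    y = grad_fn(w_new, overlap) - grad_fn(w_prev, overlap)
    return s, y  # store only if y.dot(s) is sufficiently positive
```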


Statistically adaptive learning for a general class of cost functions (SA L-BFGS)

We present a system that enables rapid model experimentation for tera-scale machine learning with trillions of non-zero features, billions of training examples, and millions of parameters. Our contribution to the literature is a new method (SA L-BFGS) that adapts batch L-BFGS to run in near real time by using statistical tools to balance the contributions of previous weights, old training ...


Artificial Immune System for Single Machine Scheduling and Batching Problem in Supply Chain

This paper addresses a production and outbound distribution scheduling problem in which a set of jobs has to be processed on a single machine for delivery to customers or to other machines for further processing. We assume that there is a sufficient number of vehicles and that the delivery cost is independent of batch size but depends on each trip. In this paper, we present an Artificial Imm...



Journal:
  • CoRR

Volume: abs/1802.05374

Issue: –

Pages: –

Publication date: 2018